Beyond MD17: the reactive xxMD dataset

System specific neural force fields (NFFs) have gained popularity in computational chemistry. One of the most popular datasets as a bencharmk to develop NFF models is the MD17 dataset and its subsequent extension. These datasets comprise geometries from the equilibrium region of the ground electronic state potential energy surface, sampled from direct adiabatic dynamics. However, many chemical reactions involve significant molecular geometrical deformations, for example, bond breaking. Therefore, MD17 is inadequate to represent a chemical reaction. To address this limitation in MD17, we introduce a new dataset, called Extended Excited-state Molecular Dynamics (xxMD) dataset. The xxMD dataset involves geometries sampled from direct nonadiabatic dynamics, and the energies are computed at both multireference wavefunction theory and density functional theory. We show that the xxMD dataset involves diverse geometries which represent chemical reactions. Assessment of NFF models on xxMD dataset reveals significantly higher predictive errors than those reported for MD17 and its variants. This work underscores the challenges faced in crafting a generalizable NFF model with extrapolation capability.


Introduction
The development of molecular force fields driven by data is predominantly benchmarked against the MD17 dataset introduced by Chmiela et al. 1 and its extension, the rMD17 2 .These datasets consist dynamic data of ten small to medium-sized gas-phase molecules.In molecular dynamics, data are intrinsically time-series sequences, necessitating careful sampling to prevent unintended information leakage into future states.A detailed analysis of MD17 and its variants reveals a significant sampling bias towards a narrow potential energy surface (PES) region close to the equilibrium structure.This narrow exploration of PES leads to limited conformation and energy space sampling, as our internal coordinate analysis shows.Thus, these datasets are suboptimal in terms of segmentation strategy and the molecular conformation space they cover.
For our discussion, we refer to these conventional molecular dynamics datasets as in-distribution (ID) datasets.Yet, many chemical processes of interest occur out-of-distribution.Consider a basic chemical reaction depicted in Figure 1: the nuclear configuration space includes reactants, transition states, and products.Sampling exclusively from the reactant region fails to capture the full dynamics of chemical reactions.As a result, NFF models trained on such skewed datasets are biased towards reactant configurations, potentially leading to qualitatively inaccurate predictions for a complete chemical reaction.
To overcome these challenges, we introduce the extended excited-state molecular dynamics (xxMD) dataset in this work.The xxMD retains the core objective of capturing trajectory data for small to medium-sized gas-phase molecules but distinguishes itself by incorporating nonadiabatic trajectories which include the dynamics of excited electronic states.Comprising four photochemically active molecules, the xxMD begins with significantly higher initial energies, enabling it to traverse a more extensive nuclear configuration space and more authentically represent the entire chemical reaction PESreactants, transition states, and products.4][5][6][7][8] By including these key regions, the xxMD dataset aims to establish new benchmarks and challenges for NFF models, providing a more comprehensive and chemically accurate dataset for the development of predictive models.
We note that our development of xxMD datasets is not the first attempt ever to try to go beyond the (r)MD17 datasets.For example, the recently developed WS22 database 9 tries to include nuclear configurations from multiple minima and interpolate among these configurations.Although WS22 has gone beyond (r)MD17, the xxMD datasets developed in current work involve much more complex configurations, for example, regions that correspond to conical intersections and avoided crossings.
Existing datasets: MD17 and its variant Chmiela et al. performed adiabatic ab initio molecular dynamics (AIMD) simulations on small gas-phase molecules at room temperature, with the electronic potential energies computed at the Kohn-Sham density functional theory (KS-DFT) level 1 .However, the original publication did not provide detailed specifics about the density functional, basis set, spin-polarization, grid for integration, and the software used.This lack of transparency presents a challenge for reproducibility and may limit the utility of the dataset for certain types of chemical simulation.Addressing the need for clarity, Christensen et al. revisited the potential energies and forces of the MD17 dataset, recalculating them using the PBE density functional with the def2SVP basis set and enhanced grid precision 2 .This effort led to the creation of the rMD17 dataset, which has since been widely adopted in NFF studies 10,11 .Nonetheless, it is crucial to note the limitations of the PBE functional and def2SVP basis set for simulating accurate chemical reactions.While these computational tools can produce a continuous PES that varies with nuclear configuration, their ability to yield accurate results for chemically complex reactions -especially those involving bond breaking and formation -is often questioned.Despite these concerns, the MD17 and its refined counterpart, rMD17, are still considered to be well-behaved datasets for benchmarking purposes within certain constraints.Adiabatic molecular dynamics datasets generated at low energy range are inherently limited in their sampling diversity and may not benefit fully from techniques such as uniform sampling and cross-validation.This is particularly true for adiabatic AIMD simulations, where initial low-energy conditions substantially constrain the nuclear configuration space.This limitation results in trajectories that predominantly occupy the reactant zone of the PES, as depicted in Figure 1.
To evaluate the breadth of configurations in the MD17 and rMD17 datasets, we conducted an analysis focused on internal coordinate distributions for azobenzene (C-N=N-C dihedral angle and the N=N bond length) and malonaldehyde (C-C-C=O dihedral angle and the C=O bond length).These distributions, along with the corresponding relative electronic potential energies and force norms, are illustrated in Figure 2. The visual representation confirms that the internal coordinates distribution is notably narrow.Consequently, we observe a significant overlap between the training and testing samples within these datasets.Such overlap raises concerns about potential data leakage, which could inadvertently lead to overly optimistic results in benchmarking studies, as discussed in the literature [10][11][12][13][14] .The findings underscore the need for datasets that encompass a more diverse and extensive sampling of the PES to ensure robust and reliable benchmarks for NFF models.

Dataset Requirement
In classical MD and adiabatic AIMD simulations, chemical reactions are characterized by the system's transition across different minima on the PES.These transitions correspond to changes in electronic potential energy as the system moves through various nuclear configurations.Systems naturally tend to follow the path of least resistance, referred to as the reaction pathway.To develop accurate NFFs, two fundamental elements are required: a comprehensive quantum chemical dataset that captures the full range of molecular transformations from various regions, and an advanced machine learning model with the capacity to interpolate and extrapolate across the PES.The figure below illustrates typical trajectories on a PES.It's evident that trajectories tend to be localized around the ground state minima.
In contrast, datasets derived from nonadiabatic dynamics simulations are particularly valuable as they provide a more diverse array of nuclear configurations, going beyond the limitations of low energy adiabatic AIMD.These enriched datasets allow for the exploration of PES regions that are critical for understanding complex chemical processes, which are often not adequately represented in low energy adiabatic simulations.

Summary
In summary, the xxMD dataset developed in current work includes four molecular systems: azobenzene, malonaldehyde, stilbene, and dithiophene, with crucial geometries along their reaction pathways illustrated in Figure ??.Notably, azobenzene and malonaldehyde are also part of the MD17 and rMD17 datasets, allowing for direct comparison.
The geometries are sampled from nonadiabatic dynamics.The potential energies and gradients, i.e. forces, for the first three singlet electronic states at the state-averaged complete active state self-consistent field (SA-CASSCF) level of theory 15 are included in xxMD-CASSCF dataset.In addition, spin-polarized KS-DFT with M06 functional 16 calculations are performed on the same geometries as in xxMD-CASSCF dataset, the resulting ground singlet electronic state potential energies and gradients are included in the xxMD-DFT dataset.Therefore, the xxMD datasets developed in current work involve a multi-state dataset -xxMD-CASSCF dataset, and a single-state dataset -xxMD-DFT dataset.

Method
For our xxMD dataset, we employ the trajectory surface hopping (TSH) semiclassical nonadiabatic dynamics algorithm 3,4,17 with SA-CASSCF electronic theory. 15The SA-CASSCF is a multireference electronic structure theory that provides qualitatively correct description of strong correlation -which are critical for deformed geometries and conical intersections, while the linear response time dependent Kohn-Sham density function approximations failed qualitatively. 18,19 e ensured that only energy-conserving trajectories were sampled.The size of the data samples is detailed in Table ?? in supplementary material.
Nevertheless, to ensure compatibility with prevalent datasets like MD17, we also computed single-point spin-polarized KS-DFT (or unrestricted KS-DFT) values.These calculations employ the M06 16 exchange-correlation functional -a notably superior meta-GGA functional relative to PBE.This dual approach culminates in two datasets: xxMD-CASSCF and xxMD-DFT.The former captures potential energies and forces across the first three electronic states for azobenzene, dithiophene, malonaldehyde, and stilbene.The latter provides recomputed ground-state energy and force values, anchored on the same trajectories.All computational details are described in supplementary information section G Computational details.Notice that SA-CASSCF PESs can be more complicated than DFT surfaces due to more complicated electronic structure algorithm from SA-CASSCF, i.e. choice of active space.Both xxMD datasets are structured via a temporal split method, partitioning training and testing data based on trajectory timesteps.1][22][23][24] Second, the purpose of the current work is to provide a database which includes a wide nuclear configuration space for which the energies and gradients of multiple electronic states are available.Therefore, the machine learning force field models can be tested against each surfaces.][27] We evaluated six message-passing NFF models on the xxMD datasets: SchNet 28 , DimeNet++ (DPP) 29 , SphereNet (SPN) 14 , NequIP 10 , Allegro 30 , and MACE 11 .Each model was mostly used with its default parameters, and in line with convention, we trained the NFFs emphasizing more on force losses.While hyperparameter optimization could potentially improve performance (See Supplementary Information for an example), it remains outside the scope of this study.Therefore, the presented results might not showcase the absolute best performance for each model.Given our observations, we encourage researchers aiming to apply NFFs in practical scenarios to conduct rigorous re-benchmarks tailored to their specific chemical systems and objectives.
Temporal splitting was chosen over random splitting to partition the xxMD datasets.This method involves dividing time-series data based on timesteps, reserving a specific range for testing and applying a 50:25:25 split for training, validation, and testing sets.Such a split allows for a rigorous assessment of a model's ability to predict unexplored areas of the PES.This is highlighted in Figure 3, where deviations in trajectories over time emphasize the datasets' capability to challenge and evaluate the extrapolative power of NFFs.However, it is possible to use random splitting on xxMD datasets considering the wide coverage of conformation space.

Data Records
The xxMD-CASSCF and xxMD-DFT datasets have been made publicly available on GitHub at the following URL: https: //github.com/zpengmei/xxMD;and on Zenodo at the following URL: https://doi.org/10.5281/zenodo.10393859. 31These datasets are stored in compressed archives, each containing pre-split extended XYZ format files based on temporal information.The files have been processed using the Atomic Simulation Environment (ASE) software package, as documented in the reference 32 .The GitHub repository is structured into two main directories, each corresponding to one of the datasets: xxMD-CASSCF and xxMD-DFT.
Within each directory, data is further organized into subdirectories named after the four molecules studied: malonaldehyde, azobenzene, stilbene, and dithiophene.Each molecule's subdirectory contains the associated dataset files.Notably, the xxMD-CASSCF dataset includes an additional subdirectory structure that segregates the state-specific data for the first three electronic states.

Dynamic properties
Through the ensemble-averaged radial distribution function (RDF) and mean square displacement (MSD), the xxMD datasets exhibit a comprehensive sampling of the nuclear configuration space, surpassing that observed in MD17.Illustrated in Figure 3, the RDF and MSD track nuclear configurations over time, offering insights into the spatial distribution and mobility of particles, respectively.The RDF measures the likelihood of particle presence at varying radial distances from a reference point, whereas the MSD quantifies the average squared distance that molecules travel over a time interval.
The pronounced shifts in nuclear configurations captured by nonadiabatic dynamics in the xxMD datasets, as reflected in the dynamic breadth of the RDF and MSD, underline the enhanced diversity of PES regions sampled.Consequently, the complexity of mastering the PESs for molecules in the xxMD dataset is expected to be significantly elevated, presenting a robust challenge for the accuracy of NFFs.

Benchmarks on xxMD-CASSCF and xxMD-DFT datasets
We picked six representative equivariant NFFs to benchmark.The hyperparameters and training details of models are described in the supplementary information.We used a weighted loss of 1:1000 on energy and forces.We stress that our purpose is not to perform an extensive comparison of models over multiple choices of hyperparameters.Rather, we limit ourselves to showing the performance of the models in the default configurations.
We first evaluate the regression precision of all models on the first three electronic states, which are labeled as S 0 , S 1 , and S 2 respectively (Label S denotes the singlet spin state which is a widely used notation in photochemistry) by using the temporal splitting approach for data in xxMD-CASSCF dataset.The MAE of the predictive energies and forces for test sets are shown in Table 1.Similarly, we present such results of using xxMD-DFT datasets in Table 2.Additional results on the validation sets are available in the supporting information.Note that validation sets depict the nuclear configurations that are closer to the training sets due to the temporal splitting.Therefore, the MAE shown in validation sets are in general lower than that for test sets.

Comparison with existing datasets
In this section, we analyze model behavior for two molecules, namely azobenzene and malonaldehyde.These two molecules are both available in xxMD and (r)MD17 datasets.Benchmarks for (r)MD17 reveal that the accuracy of MACE, NequIP, and  SPN exceeds that of traditional electronic structure methods 10,11,14,33 .It's essential to note that typical errors for KS-DFT in predicting relative transition state energy can be several kcal/mol.For instance, the MAEs of HTBH38 (Hydrogen transfer barrier heights) and NHTBH38 (non-Hydrogen transfer barrier heights) databases are about 9.1 kcal/mol for PBE and 2.4 kcal/mol for M06.Thus, an NFF fitting error below 50 meV would surpass the accuracy of modern density functional calculations.However, such claims are pertinent mainly to ground state potential energies, given that excited state calculations are often less precise.Therefore, given the reported MAEs, these NFF models perform admirably on (r)MD17 datasets.However, this conclusion might be deceiving.Previous discussions highlight the constrained nuclear configuration space in MD17 and rMD17.A comparative analysis of MAEs for the six NFF models on azobenzene and malonaldehyde from xxMD-DFT and (r)MD17 is presented in Table 3. Literature-derived MD17/rMD17 results indicate that all models used 1,000 training samples 10,11,14 .Predictably, the predictive prowess of NFF models diminishes when applied to the xxMD dataset.
The differences of MAEs for a same NFF model for rMD17 and xxMD come from two aspects, namely, the differences in dataset, and the differences in splitting method.The xxMD datasets contain much more complex nuclear configurations than (r)MD17.For the splitting method, one can have either random splitting or temporal splitting.For certain purposes, for example, if one uses the trajectory data to construct a global PES for the system, random splitting would be a good approach.For purpose of extended trajectory simulation with existing trajectory data, temporal splitting may be favored.Because the ultimate goal is to look for unknown chemical events that may not be observable from short trajectory simulations.In that spirit, we use temporal splitting in the current work.For the purpose of extended trajectory simulation, random splitting, which has been used to test against (r)MD17 dataset, means a severe leakage of future information.In practice, if we would like to model a chemical reaction, it would be impractical to manually sample every relevant region on the potential energy surfaces.Therefore, it is a desired property for an NFF model has the capability of physical extrapolation to some extent.Physical extrapolation is achieved in several models, for examples, reactive force field, 34 and parametrically managed activation function. 35he effectiveness of NFF models largely depends on the datasets they are benchmarked against.Historically, the (r)MD17 datasets have been the gold standard for this purpose.However, our study highlights the potential shortcomings of relying solely on (r)MD17 datasets.Given that they primarily capture a narrow nuclear configuration space from low energy ground state AIMDs, they fall short of encompassing the holistic nuclear configuration pertinent to chemical reactions.Training NFF models on such datasets can be somewhat trivial and could result in misleading conclusions about their true capabilities.For instances, computational chemists have a long history of using system specific force fields, which can be easily developed by computing a hessian at the ground state equilibrium geometry. 36,37  address this gap, we introduced the xxMD dataset, derived from nonadiabatic dynamics trajectories.The xxMD dataset offers a comprehensive representation of the nuclear configuration space, encapsulating the reactant, transition state, product, and conical intersection regions of PESs.Its inclusion of several low-lying excited state potential energy surfaces underscores its importance and the challenges it presents for NFF model development.Our benchmarks of prevailing NFF models on the xxMD dataset have revealed pronounced difficulties.Utilizing default hyperparameters, the chosen NFF models struggled to offer quantitatively or even qualitatively accurate force field models for specific systems.We anticipate that our findings will galvanize the community towards pioneering more advanced NFF models better equipped to study intricate chemical reactions.

A Preliminaries of dynamics
In the realm of quantum mechanics, the behavior of nuclei is ideally described by the time-dependent Schrödinger equation.Yet, practical computation limits restrict nuclear quantum dynamics simulations to small systems with just 5 or 6 atoms.Consequently, in many cases, the nuclei are treated as classical particles.This premise paves the way for classical Molecular Dynamics (MD) and adiabatic Ab Initio Molecular Dynamics (AIMD), wherein the dynamics are propagated based on a single electronic state.
At the heart of classical MD is the Newtonian equation of motion: where m i denotes the mass of atom i, r i its position, and F i the force exerted on it.This force can be described as the negative gradient of the potential energy V at the atom's location: The ground state electronic potential energy, V (r i ), in the absence of an external field, forms the basis for the PES.Classical force fields offer an analytical approximation of this energy based on nuclear configuration: This classical approximation often falls short under quantum mechanical scenarios, particularly during bond breaks, necessitating improvements in force field formulations.Upon electronic excitation, as observed in solar cells or photochemical reactions, nuclei confront electronic potentials beyond the ground state.Herein, dynamics involving multiple electronic states emerge.Nonadiabatic dynamics, particularly pertinent when energy levels soar, may either adopt the trajectory surface hopping method or the semiclassical Ehrenfest dynamics, depending on the specific conditions.

B Brief introduction of chosen neural force fields
In this study, we picked six representative neural network architectures for NFF applications, namely, SchNet ?, DPP ?, SPN ?, NequIP ?, Allegro ? and MACE ? .In general, those approaches can be devided into two catagories based on the representation of the feature space.SchNet, DPP and SPN are the so-called scaler-based NFFs, while NequIP, MACE and Allegro are vector-based NFFs, as we summarized in Table S1.
The key concept in SchNet is the continuous-filter convolution, which involves two steps: interaction and update.In the interaction step, the model calculates pairwise interaction features between all atoms based on their distances, using a set of radial bessel basis.The update step then uses these interaction features to update the atom-centered descriptors.In DPP, a higher-order feature, bond angle has been introduced to enhance the expressiveness of the neural network.DPP uses a concept called spherical functions to account for the directionality of the interactions between atoms.The DPP architecture uses 'interaction blocks' to propagate information through the molecular graph.Each interaction block consists of a radial and a spherical part.The radial part captures the distance-based interactions, similar to SchNet.The spherical part captures the angular interactions among any three atoms in the molecule, which is unique to DPP.As a continuation of the DPP, SPN further  3) relative displacement vectors between any two atoms in the molecule.These networks use the representation theory of the threedimensional orthogonal group to construct neurons that obey equivariance with respect rotations and reflections of a molecular system's pose.We visualize this concept in Figure S1.Atomic types are embedded as node features, and relative displacement vectors that contains the positional information are converted into activations that transform according to irreducible representations (irreps) of the orthogonal group.Nonlinearities for these activations are constructed using the tensor product, followed by applying the Clebsch-Gordon decomposition to convert the product back into irreducible components.

C Timing
In this practical view, we present a comprehensive analysis of the operational time of multiple NFFs examined in our study as illustrated in Figure S2.It is important to note that the specific runtime of each NFF model is contingent upon the chosen setup and hyperparameter selection.For example, the radius cutoff utilized for generating locally fully-connected graphs can yield varying numbers of edges and nodes in each mini-batch.Within our findings, we have diligently reported the time required for processing each sample in a mini-batch using the designated hyperparameters.Consequently, we emphasize that while we employed mostly default hyperparameters as a practical reference.

D Additional illustration of xxMD datasets
Here we provide addition illustration (Figure S4) of the xxMD-CASSCF datasets with the ground-state energy and forces as the internal coordinate analysis of MD17.For azobenzene, the primary reaction path involves the cis-trans isomerization of the two phenyl groups along the N=N bond.For malonaldehyde, the reaction path involves either a H-H cis-trans isomerization occurs along the O=C bond or a O-O cis-trans isomerization occurs along the carbon skeleton.The reaction path of stilbene involves the cis-trans isomerization of the two phenyl rings along the C=C double bond and the flip of the phenyl rings to opposite directions.The reaction path of dithiophene is also the cis-trans isomerization of two five-member rings along the C=C double bond.

E Benchmark results on the validation sets of xxMD-CASSCF and xxMD-DFT F Addition experiment of hyperparameter tuning
We would like to stress again, our purpose is to give a initial view of the datasets using common hyperparameters without tuning, and we don't aim to strictly test models listed.We left most hyperparameters unchanged as default, and uses a loss weight heavily focused on the forces following the literatures ?, ?, ? .However, users should carefully use the hyperparameters before applying to specific chemical problems.
We used a default MACE model and one subset of xxMD-CASSCF dataset and varied the weights on the energy and forces, and we found that by simply tuning this hyperparameter, MACE would perform noticeably differently.For instance, the regression accuracy on force is not improved and accuracy on energy deteriorates quickly when the weight on the force gradually increase from 1 to 1000.On the contrary, putting slightly more weights on the energy greatly improve the overall performance, as we laid out in Table S4.Thus, we would like to leave a note to future users that exploring the hyperparameter spaces is important.

G Computational details
The active space and basis set used for SA-CASSCF for all four molecules are shwon in Table S5.The total number of trajectories simulated are vary, but finally selected number of points for each molecule in xxMD dataset is summarized in Table S6 as well.These points are selected from energy conserving trajectories only, and we used the criteria for the total energy conservation as listed in Table S6.Therefore, all trajectories fail to conserve the total energy below the threshold are discarded.We show the total energy conservation in Figure S5 G.1 Dynamics Initial conformations are generated by Wigner-Sampling of the optimized ground-state structure with the same level of electronic structure method.For each conformation, a single-point calculation is performed to acquire the energy of states without spin-orbit calculations.To select initial excited-states, the MCH representation of the Hamiltonian is used to simulate delta-pulse excitation based on excitation energies and oscillators strengths with an excitation window of 0.0 to 10.0 eV.
For azobenzene, we conducted 300 fs SHARC dynamics simulations with a time step of 0.5 fs.For dithiophene, we conducted 500 fs SHARC dynamics simulations with a time step of 0.5 fs.For malonaldehyde, we conducted 300 fs SHARC dynamics with a timestep of 0.25 fs.For stilbene, we performed 500 fs SHARC dynamics with a time step of 0.5 fs.Local diabatizatrion scheme was used to calculate the non-adiabatic coupling vectors by calculating the overlap matrix of wavefunctions between steps.Non-adiabatic coupling vectors are included in the gradient transformation.kinetic energy are adjusted by rescaling the velocity vectors during a surface hop.When the surface hop is refused due to insufficient energy, the velocity doesn't reflect at a frustrated hop.Default energy-based decoherence scheme was used for decoherence correction.The standard SHARC surface hopping probabilities was used as the surface hopping scheme.All gradients and non-adiabatic couplings of active states were calculated at each time step.For azobenzene, dithiophene, malonaldehyde, and stilbene the threshold of total energy was set to 0.6 eV, 0.2 eV, 0.3 eV and 0.2 eV.

G.2 Complete Active Space Self-Consistent Field (CASSCF)
In quantum chemistry, accurately capturing electron correlation-the interaction of electrons relative to one another-is pivotal for an in-depth understanding of a molecule's electronic structure.While standard methods like Hartree-Fock (HF) have their strengths, they can falter in specific scenarios.This is where the CASSCF method becomes instrumental.Central to CASSCF is the categorization of molecular orbitals into three distinct groups: 1. Inactive (core) orbitals: These are fully occupied orbitals, exempted from the correlation treatment.

Active orbitals:
A defined number of electrons within these orbitals undergo correlation across a predetermined set of orbitals.The flexibility in electron configuration within the active space encapsulates static electron correlation.
3. Virtual (secondary) orbitals: Remaining unoccupied, these orbitals are sidelined from the primary correlation procedure.
The CASSCF methodology initially optimizes the active space orbitals employing a comprehensive configuration interaction (CI) calculation.This act of considering all plausible electron configurations within the active ambit captures static correlation.To address dynamic correlation, supplementary methods, like multi-reference perturbation theory (MRPT), are often invoked.
Advantages of CASSCF: • Offers a harmonized treatment of electron correlation.
• Particularly apt for systems with closely-spaced electronic states, encompassing transition states, metal complexes, and excited states.
However, one should note the substantial computational demands, especially with enlarging active spaces, which can potentially restrict its application or mandate approximate solutions.
For SA-CASSCF calculations, OpenMolcas 22.06 was used, which is available at https://gitlab.com/Molcas/OpenMolcas.The active space orbitals of the starting configurations are listed as following (Figure S6).

Figure 1 .
Figure 1.Trajectories on a representative potential energy surface.The contour plot represents the energy landscape, with the color gradient indicating various energy levels.Trajectories are usually confined to regions near the minima, reflecting the system's preference for low-energy states close to or at equilibrium.

Figure 2 .
Figure 2. Illustration of training and testing sets using the reference split indices for azobenzene and malonaldehyde datasets in rMD17.The X-axis depicts dihedral angles (marked by 'C', 'N', and 'O'), the Y-axis denotes bond distances (highlighted by bold letters), and the Z-axis shows relative energy.Training and testing samples are differentiated by color, correlating to force norms.Note that training samples overlap with testing ones.

Figure 3 .
Figure 3.Comparison of Average RDFs and MSDs Across Multiple Trajectories.Each row corresponds to a group of trajectories, with RDF on the left (indicating particle density as a function of distance) and MSD on the right (showing particle displacement over time).Shaded regions represent standard deviations.

Figure S1 .
Figure S1.(a) depicts three atoms with their relative displacement vectors.(b) illustrates the details of atomic embedding with E(3) equivariant activations based on spherical harmonics and one-hot encoding for chemical elements.(c) gives an illustration of spherical harmonics with quantum numbers L = 0, 1, 2. The Clebsch-Gordan coefficients are used during the aggregation step to ensure rotational equivariance when combining activations with different irreps.

Figure S2 .
Figure S2.Comparison of average computational time for NFFs.The timing is specific to the chosen hyperparameters.All NFFs, except MACE, operate with single precision.Generally, group-equivariant NFFs are significantly more computationally expensive.

Figure S3 .
Figure S3.Schematic representation of the photodynamic processes featured in the xxMD dataset.

Figure S4 .
Figure S4.Illustration of xxMD datasets using similar internal coordinates as MD17 analysis.Only the ground-state (in black/white color scheme) energies are visualized for clarity.This figure clearly indicates the breadth of the conformation space explored by using direct non-adiabatic dynamics as compared to MD simulation with room temperature.

Figure S5 . 11 Figure S6 . 11 Figure S7 .
Figure S5.Illustration of total energy conservation over the simulation time of trajectories in xxMD-CASSCF datasets.All trajectories follow the total energy conservation threshold.

Table 1 .
Comparison of predictive MAE of energy(E, meV) and forces(F, meV/A) on hold-out testing set for different models on temporally split xxMD-CASSCF datasets and tasks.

Table 2 .
Comparison of predictive MAE of energy(E, meV) and forces(F, meV/A) on hold-out testing set for different models xxMD-DFT datasets and tasks with temporal split.

Table S2 .
Comparison of predictive MAE on validation set for different models on temporally split xxMD-CASSCF datasets and tasks.Energy(E) has the unit of meV, while forces(F) have the unit of meV/A.

Table S3 .
Comparison of predictive MAE on validation set for different models xxMD-DFT datasets and tasks with temporal split.Energy(E) has the unit of meV, while forces(F) have the unit of meV/A.

Table S4 .
Predictive MAE of energy (meV) and forces (meV/A) on the ground-state azobenzene in xxMD-CASSCF dataset using various loss weights and default MACE model.